[opt] Enable CFG optimization for local tensors #3237
Conversation
/format
Nice catch! thanks!
Nice!
LGTM!
(Just curious, is it possible that there is a local tensor without an initial value?)
Currently it is impossible. But maybe we will support that in the future, and the CFG part will be updated accordingly.
Related issues: #2590, #2637, #3218, #3228
Although #2637 introduced local tensors, the related CFG optimization was not implemented, resulting in redundant local memory allocations, loads, and stores in many cases. This PR enables CFG optimization for local tensors and eliminates the overhead reported in #3218 and #3228. Let's look at a tiny example:
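For concreteness, a minimal sketch of such a kernel (the field and its shape here are assumptions for illustration, not necessarily the original example; the ti.Vector only goes through the local-tensor path when it is lowered as a single TensorType alloca rather than unrolled into scalars):

```python
import taichi as ti

ti.init(arch=ti.cpu)

x = ti.field(ti.i32, shape=8)  # illustrative field, not from the original PR

@ti.kernel
def fill():
    for i in x:
        v = ti.Vector([1, 2, 3])  # local tensor initialized with values
        # Each element access below is a load through a pointer offset into v;
        # with CFG optimization, the initializing stores can be forwarded into
        # these loads and the local allocation itself can then be eliminated.
        x[i] = v[0] + v[1] + v[2]

fill()
```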
Before this PR, the final IR for the kernel is:
After this PR, the final IR for this kernel is:
Details:
- Support alias analysis for PtrOffsetStmt, which is able to produce definitely same/definitely different results (see the sketch after this list).
- Treat a TensorType alloca as a store to enable store-to-load forwarding. This is because currently local tensors must be initialized with values (ti.Vector([1, 2, 3])), so we don't treat the TensorType alloca itself as a valid forwarding source.
- Don't propagate a PtrOffsetStmt with an alloca origin to other offloaded tasks (it shouldn't appear in the final node's live_in), to enable dead store elimination.
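To make the first point concrete, here is a rough Python sketch of the definitely-same/definitely-different query for two pointer-offset accesses; the names (PtrOffset, AliasResult, alias) are placeholders of mine, not the actual C++ implementation in Taichi:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class AliasResult(Enum):
    SAME = 1        # definitely the same address
    DIFFERENT = 2   # definitely different addresses
    UNCERTAIN = 3   # cannot decide at compile time

@dataclass(frozen=True)
class PtrOffset:
    """Stand-in for a pointer-offset access: a base alloca plus an element offset."""
    origin: str            # identity of the underlying alloca
    offset: Optional[int]  # constant offset, or None if only known at runtime

def alias(a: PtrOffset, b: PtrOffset) -> AliasResult:
    if a.origin != b.origin:
        # Offsets into different allocas can never overlap.
        return AliasResult.DIFFERENT
    if a.offset is not None and b.offset is not None:
        # Same alloca with constant offsets: compare the offsets directly.
        return AliasResult.SAME if a.offset == b.offset else AliasResult.DIFFERENT
    # A runtime-dependent offset gives no definite answer.
    return AliasResult.UNCERTAIN

# v[0] vs v[0]: definitely same; v[0] vs v[1]: definitely different.
assert alias(PtrOffset("v", 0), PtrOffset("v", 0)) is AliasResult.SAME
assert alias(PtrOffset("v", 0), PtrOffset("v", 1)) is AliasResult.DIFFERENT
assert alias(PtrOffset("v", 0), PtrOffset("v", None)) is AliasResult.UNCERTAIN
```

Only the definite answers let the pass forward a store into a later load or delete a dead store; an uncertain result has to be treated conservatively.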